Heuristic Based Extraction of Causal Relations from Annotated Causal Cue Phrases
Authors
Matthew J. Hausknecht
Abstract
This work focuses on the detection and extraction of causal relations from open-domain text, starting with annotated Causal Cue Phrases (CCPs). It is argued that the problem of causality extraction should be decomposed into two distinct subtasks. First, the CCPs within a body of text must be identified. Second, using these CCPs, the cause and effect phrases of each causal relation must be extracted. To show that CCPs are an essential part of causality extraction, it is experimentally demonstrated that the accuracy of cause and effect phrase extraction increases dramatically when CCP knowledge is used. A 31% increase in the accuracy of cause and effect phrase extraction is found between two equivalent CRF machine learning algorithms when simple, word-based knowledge of CCPs is taken into account. Furthermore, it is shown that cause and effect phrase extraction can be performed accurately and robustly without the aid of complex machine learning techniques. A simple, heuristic-based extraction algorithm, centered on three distinct classes of CCPs, is introduced. This algorithm achieves an accuracy of 87% on the task of extracting cause and effect phrases. While the problem of identifying CCPs in open-domain text is not addressed, it is hypothesized that this task is far easier than identifying cause and effect phrases alone, because the space of all possible CCPs is far smaller than that of all causal relations. Finally, this work contributes a free, publicly accessible corpus explicitly annotated with both intra-sentential causal relations and their corresponding Causal Cue Phrases. It is our hope that this resource may see future use as a standard corpus for the task of causality extraction.
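The second subtask named in the abstract, extracting the cause and effect phrases once a cue phrase is already annotated, can be illustrated with a minimal sketch. The abstract does not say what the three CCP classes are, so the two cue orientations, the cue lists, and the function name `extract_relation` below are hypothetical. The sketch shows only the general idea of splitting a sentence around a known cue, not the algorithm evaluated in the work.

```python
# Illustrative only: these cue lists and the two orientations are assumptions,
# not the three CCP classes used in the original work.
CAUSE_FIRST = {"causes", "leads to", "results in"}   # pattern: cause CUE effect
EFFECT_FIRST = {"because", "because of", "due to"}   # pattern: effect CUE cause


def extract_relation(sentence: str, cue: str) -> dict:
    """Split a sentence around an already-annotated, lowercase cue phrase."""
    start = sentence.lower().find(cue)
    if start == -1:
        raise ValueError(f"cue {cue!r} not found in sentence")
    # Text on either side of the cue, with surrounding punctuation trimmed.
    left = sentence[:start].strip(" ,.")
    right = sentence[start + len(cue):].strip(" ,.")
    if cue in CAUSE_FIRST:
        return {"cause": left, "cue": cue, "effect": right}
    if cue in EFFECT_FIRST:
        return {"cause": right, "cue": cue, "effect": left}
    raise ValueError(f"no orientation known for cue {cue!r}")


result = extract_relation("The flight was delayed because of heavy fog.", "because of")
# result["cause"] == "heavy fog"; result["effect"] == "The flight was delayed"
```

The cue is passed in rather than detected, which mirrors the abstract's decision to leave CCP identification in open-domain text as a separate, unaddressed problem.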
Similar Resources
Building a Japanese Corpus of Temporal-Causal-Discourse Structures Based on SDRT for Extracting Causal Relations
This paper proposes a methodology for generating specialized Japanese data sets for the extraction of causal relations, in which temporal, causal and discourse relations at both the fact level and the epistemic level, are annotated. We applied our methodology to a number of text fragments taken from the Balanced Corpus of Contemporary Written Japanese. We evaluated the feasibility of our method...
Catching the Common Cause: Extraction and Annotation of Causal Relations and their Participants
In this paper, we present a simple, yet effective method for the automatic identification and extraction of causal relations from text, based on a large English-German parallel corpus. The goal of this effort is to create a lexical resource for German causal relations. The resource will consist of a lexicon that describes constructions that trigger causality as well as the participants of the ca...
Causal Relation Extraction Using Cue Phrase and Lexical Pair Probabilities
This work aims to extract causal relations that exist between two events expressed by noun phrases or sentences. Previous work on causality made use of causal patterns such as causal verbs. We concentrate on the information obtained from other causal event pairs. If two event pairs share some lexical pairs and one of them is revealed to be causally related, the causal probability of a...
Extracting Explicit and Implicit Causal Relations from Sparse, Domain-Specific Texts
Various supervised algorithms for mining causal relations from large corpora exist. These algorithms have focused on relations explicitly expressed with causal verbs, e.g. “to cause”. However, the challenges of extracting causal relations from domain-specific texts have been overlooked. Domain-specific texts are rife with causal relations that are implicitly expressed using verbal and non-verba...
Real-Time intrusion detection alert correlation and attack scenario extraction based on the prerequisite consequence approach
Alert correlation systems attempt to discover the relations among alerts produced by one or more intrusion detection systems to determine the attack scenarios and their main motivations. In this paper a new IDS alert correlation method is proposed that can be used to detect attack scenarios in real-time. The proposed method is based on a causal approach due to the strength of causal methods in ...